Nvidia GeForce RTX 4090 review: Brutally, futuristically fast



Nvidia’s monstrous GeForce RTX 4090 delivers luxuriously fast frame rates and futuristic features, but DLSS 3’s AI speed boost may be the real star. It’s a behemoth of a GPU that draws a lot of power, but Nvidia’s sublime Founders Edition design remains cool, quiet, and eye-catching.

Nvidia’s $1,599 GeForce RTX 4090 graphics card sports a luxury price tag, but you get a truly luxurious gaming experience in return.

Want face-melting frame rates? The last-generation RTX 3090 was a speed demon, but the GeForce RTX 4090—thanks to Nvidia’s new “Ada Lovelace” architecture—screams through traditional games up to 83 percent faster. Nvidia’s 40-series flagship chews through futuristic ray-traced games too, propelled by more advanced third-gen RT cores and a radical new “Frame Generation” DLSS 3 feature that doubles (or more) frame rates yet again with a heaping helping of AI.

It’s not just about games, though. Creators will love the 24GB of blazing-fast GDDR6X memory when it comes to pixel-packed video renders, and Nvidia’s inclusion of not one, but two AV1 video encoders means the RTX 4090 is suitably equipped to handle the future of streaming.

No matter what you want to accomplish, the monstrous GeForce RTX 4090 Founders Edition can handle it without breaking a sweat—though your wallet and power supply might, as all this extra performance doesn’t come free. Let’s dig in.


As the first GPU of a new generation, the RTX 4090 gives us our first look at Nvidia’s Ada Lovelace architecture, running on a custom TSMC 4N (read: fancy 5nm) process. It’s a ferocious jump. Before we start, here’s a high-level look at how the RTX 4090’s specifications compare to prior-gen GeForce flagships (which amusingly all sport different tier classifications):

The leap from Samsung’s 8N process to TSMC’s much more advanced 4N process can’t be overstated. Last-generation’s full-fat RTX 3090 Ti packed 10,752 CUDA graphics cores and 28.3 billion transistors into a massive 628.4mm² die. The new RTX 4090 ups that to a ludicrous 16,384 CUDA cores and 76.3 billion transistors on a smaller die. Thanks, TSMC!

But Nvidia worked its magic, too. While the bones of the Ada Lovelace architecture don’t stray wildly from last-gen’s Ampere design—though there’s a lot more of everything—Nvidia made some key tweaks to push performance even further. One example is the new Shader Execution Reordering feature, which groups and schedules similar shading work on the fly. It boosts performance in traditionally rasterized games, sure, but accelerates ray tracing even more.

Enhancing ray tracing performance was clearly a focus for Nvidia with the RTX 4090. The Ada Lovelace architecture upgrades to third-generation RT cores and fourth-gen tensor AI cores, all faster and more efficient than before. They’re augmented by a beefed-up Optical Flow Accelerator that, in tandem with the tensor cores, unlocks DLSS 3 and its new Frame Generation feature.

Nvidia’s DLSS 2 already stood strong as the gold standard for image upscaling, but DLSS 3 goes even further, using the tensor cores to create fully AI-generated frames that slot in between traditionally rendered frames. That can drastically increase performance—even in CPU-bound games, thanks to the way it works.

It all adds up. Without ray tracing active, the RTX 3090 could run Cyberpunk 2077 at 48 frames per second at 4K. The RTX 4090 gets 77fps out of the gate, but flipping on DLSS 3 lets it soar all the way to 138fps—and that’s with strenuous ray tracing effects active. It’s tremendously impressive. There are some key DLSS 3 details to be aware of, though. We’ll dive much deeper into DLSS 3 and ray tracing later in our performance benchmarks.
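If you want to sanity check how those uplifts stack, the arithmetic takes just a few lines of Python (the frame rates come straight from our results above; the variable names are ours):

```python
# Stacking the RTX 4090's two uplifts in Cyberpunk 2077 at 4K.
rtx_3090_native = 48    # fps, ray tracing off, native rendering
rtx_4090_native = 77    # fps, ray tracing off, native rendering
rtx_4090_dlss3 = 138    # fps, ray tracing ultra + DLSS 3 Frame Generation

raw_uplift = rtx_4090_native / rtx_3090_native - 1
print(f"Raw generational uplift: {raw_uplift:.0%}")  # -> 60%

# DLSS 3 then nearly doubles performance again, even with ray tracing piled on:
print(f"DLSS 3 multiplier: {rtx_4090_dlss3 / rtx_4090_native:.2f}x")  # -> 1.79x
```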

Intel beat Nvidia to the AV1 encoding punch with its debut Arc graphics cards—which are also launching on October 12, concurrently with the RTX 4090—but Nvidia is taking the futuristic video technology a step further by integrating two AV1 encoders into its new NVENC media engine. Nvidia says its AV1 encoding implementation is 40 percent more efficient than its already best-in-class H.264 encoding, and our deep-dive into Arc’s AV1 encoding goes into what this means for the future of streaming video.

Other, more traditional metrics also improved. In addition to Ada simply packing in more of everything, clock speeds rose dramatically, to a rated 2,520MHz Boost Clock (though the RTX 4090 can dynamically clock much higher than that in games). That’s roughly 700MHz faster than the 3090 Ti. The new RTX 4090 also integrates the faster 21Gbps GDDR6X memory modules introduced in the 3090 Ti, letting the RTX 4090’s 24GB of memory blaze along at over a terabyte per second of overall memory bandwidth. Cue Keanu Reeves: Whoa.
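For the curious, that terabyte-per-second figure falls straight out of the memory specs. Here’s the back-of-the-envelope math, assuming the RTX 4090’s 384-bit memory bus (the interface width Nvidia lists on its spec sheet):

```python
# GDDR6X bandwidth: per-pin data rate times bus width, converted to bytes.
data_rate_gbps = 21     # gigabits per second, per pin
bus_width_bits = 384    # memory interface width

bandwidth_gbits = data_rate_gbps * bus_width_bits   # 8,064 gigabits per second
bandwidth_gbs = bandwidth_gbits / 8                 # bits -> bytes
print(f"{bandwidth_gbs:.0f} GB/s")                  # -> 1008 GB/s, just over 1TB/s
```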

Adding all these capabilities requires more power, however. While the GeForce RTX 4090 ostensibly packs the same 450W total graphics power rating as the 3090 Ti, real-world power use comes in a bit higher, and Nvidia adopted the new 12VHPWR 16-pin cable for ATX 3.0 power supplies, which is designed to handle higher GPU power needs. You don’t need a new ATX 3.0 power supply to run the 4090, however (though you do need an 850W minimum PSU). Nvidia includes a 12VHPWR 16-pin adapter in the box that hooks up to four traditional 8-pin power connectors—three mandatory, with the fourth required for overclocking endeavors.

Nvidia’s 12VHPWR adapter winds up looking cluttered in practice, and would have been more aesthetically pleasing with longer cables. 

It works, but it’s an ugly kludge around an otherwise sterling Founders Edition design. Corsair sent us its HX1500i power supply and optional $20 12VHPWR 600 cable for testing, and using that native cable instead of the adapter looks far cleaner. If your PSU manufacturer offers a 12VHPWR cable for a reasonable price, I highly recommend jumping on it. You don’t want your $1,600 graphics card looking janky, after all.

Nvidia’s “new” Founders Edition design largely mirrors the RTX 30-series aesthetic and push-pull “flow through” cooling system. That’s a good thing. These metal-clad cards look outstanding, albeit massive. Nvidia tweaked things under the hood to improve cooling performance for this supremely powerful graphics card, however. The GeForce RTX 4090 Founders Edition sports bigger fans with fluid dynamic bearings now, paired with a taller, redesigned heatfin stack. Nvidia claims the RTX 4090 FE “delivers the highest airflow ever measured in a discrete Nvidia GPU, achieving 15 percent more airflow than an RTX 3090 at the same acoustic level.” In practice, the RTX 4090 FE runs utterly cool and utterly quiet. It rocks, though custom GeForce RTX 4090 cards are also available.

Of course, the GeForce RTX 4090 supports Nvidia’s killer software stack, from NVENC to Broadcast to Shadowplay to Nvidia Reflex (which is actually a core component of DLSS 3). Software and features remain a key strength for GeForce, especially in contrast to Intel’s spotty debut Arc drivers.

But enough talk. Let’s get to the benchmarks. Giddy-up.

We test graphics cards on a top-of-the-line AMD Ryzen 9 5900X PC used exclusively for benchmarking GPUs. But there’s one new tweak for the start of this GPU generation: We’ll now test with PCIe Resizable BAR (also known as Smart Access Memory on Ryzen systems) active, as most modern gaming PCs released in the last four years support the performance-boosting feature, either natively or via a motherboard firmware update.

Nvidia also recommends turning on the optional “Hardware-accelerated GPU scheduling” option in Windows to let the RTX 40-series stretch its legs to the fullest, so we’ve made that tweak as well. Most of the hardware was provided by the manufacturers, but we purchased the storage ourselves. Special thanks to Corsair, which sent us a HX1500i power supply and optional $20 12VHPWR 600 cable for testing the Founders Edition.  

We’re comparing the $1,599 GeForce RTX 4090 Founders Edition against its direct predecessor, the $1,500 GeForce RTX 3090 Founders Edition, as well as AMD’s rival Radeon RX 6900 XT, which launched at $1,000 but provided similar performance to the 3090 in rasterized games. Nvidia and AMD both released more powerful variants of those GPUs, in the form of the RTX 3090 Ti and Radeon RX 6950 XT, but we don’t have those graphics cards on hand for direct comparison with new drivers.

In addition to turning on PCIe Resizable BAR by default, we’re also moving to a new set of games in our testing suite. We test a variety of games spanning various engines, genres, vendor sponsorships (Nvidia, AMD, and Intel), and graphics APIs (DirectX 9, DirectX 11, DX12, and Vulkan) to try to represent a full range of performance potential. Each game is tested using its in-game benchmark, sanity-checked with Nvidia’s FrameView tool, at the highest possible graphics preset unless otherwise noted. VSync, frame rate caps, real-time ray tracing or DLSS effects, and FreeSync/G-Sync are disabled, along with any other vendor-specific technologies like FidelityFX tools or Nvidia Reflex. We’ve also enabled temporal anti-aliasing (TAA) to push these cards to their limits. We run each benchmark at least three times and list the average result for each test.

Let’s kick things off with a pair of ultra-popular esports games, and a tactical strategy game running on DX11.

The GeForce RTX 4090 is so fast that it can run into CPU bottlenecks in some games even at 4K resolution with the eye candy maxed, but esports games and older DirectX 11 titles don’t necessarily push it to its blistering limits. The 4090 is just 36 percent faster than the 3090 in Total War: Troy—a solid result, but one dwarfed by the advances in DX12 and Vulkan games. Likewise, while the RTX 4090 easily tops the CS:GO benchmark charts, the meager frame rate improvements at 4K and 1080p show that this long-established DirectX 9 title doesn’t have much more to give to Ada Lovelace’s might.

Rainbow Six Siege running on the modern Vulkan API shows a much more substantial uplift—running 64 percent faster than the RTX 3090 at 4K—and includes Nvidia’s latency-dropping Reflex technology to make it even more responsive. Notice the hard engine or CPU bottleneck it hits at 1440p (and perhaps even 4K), however.

That Vulkan uplift holds true when we widen our scope to examine a variety of Vulkan and DX12 games.

Here, the GeForce RTX 4090 regularly performs around 55 to 60 percent faster than its predecessor at 4K resolution, but that can climb as high as 83 percent in particularly friendly games like Hitman 3 and F1 22. Gears Tactics runs 67 percent faster on the 4090, but it could’ve been even more in theory—the game wound up being CPU-bound 7 percent of the time while running the benchmark, a feat you don’t see often at 4K/Ultra. The GeForce RTX 4090 is that fast.

Nvidia invested heavily in boosting Ada Lovelace’s ray tracing chops, equipping it with new Shader Execution Reordering capabilities, third-gen RT cores, fourth-gen tensor AI (DLSS) cores, and an Optical Flow Accelerator to supercharge DLSS 3’s performance uplift.

We’ll dig into DLSS 3 next, but first, we wanted to examine the RTX 4090’s raw ray tracing performance. Intel’s debut Arc GPUs operate at a much lower level of raw power, being roughly $300 graphics cards, but their first crack at ray tracing efficiency managed to surpass the second-gen RT cores in Nvidia’s RTX 30-series. Nvidia’s Lovelace upgrades claw the crown back.

To reveal Lovelace’s prowess, the charts below show the raw performance of the RTX 4090, 3090, and Radeon RX 6900 XT in the selected games with ray tracing off, to get a baseline. Next, we show performance with any and all available ray tracing effects active and set to ultra. The important thing to note here is how these ray-traced frame rates compare proportionally to the game running with RT off, which reveals how efficient each GPU’s raw ray tracing capabilities are. (If a game runs at 100fps with ray tracing off, and two different GPUs run the game with ray tracing on at 50fps and 25fps, for example, the former GPU offers better raw ray tracing chops).
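Expressed in code, the comparison boils down to a simple ratio. Here’s a sketch using the hypothetical frame rates from the example above (illustrative numbers, not measured results):

```python
# "Raw ray tracing efficiency": how much of its rasterized performance a GPU
# keeps once ray tracing effects are switched on.
def rt_efficiency(fps_rt_off: float, fps_rt_on: float) -> float:
    return fps_rt_on / fps_rt_off

gpu_a = rt_efficiency(fps_rt_off=100, fps_rt_on=50)
gpu_b = rt_efficiency(fps_rt_off=100, fps_rt_on=25)
print(gpu_a, gpu_b)  # 0.5 vs. 0.25 -> GPU A has the stronger RT hardware
```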

Finally, since ray tracing sends frame rates plummeting, pairing it with some sort of image upscaling technology (like Nvidia’s DLSS, AMD’s FSR 2, and Intel’s XeSS) is a necessity. We’ve also included performance metrics with DLSS running in balanced mode on Nvidia GPUs to show the sort of full-stack ray tracing performance you’ll see in the real world. (Nvidia doesn’t allow DLSS to run on rival graphics cards, so the AMD Radeon RX 6900 XT is represented by a zero in the DLSS measurements.) Running DLSS in Performance or Ultra Performance mode would result in even higher frame rates, but with potentially more image oddities. We like the balance of speed and fidelity that DLSS Balanced provides.

We’ve also included a performance chart showing Red Dead Redemption 2 performance with DLSS Balanced on the various Nvidia GPUs. That game lacks ray tracing, but these results can show us if Ada Lovelace’s new fourth-gen tensor cores are more efficient than their predecessor.

No surprises here: Nvidia kickstarted the ray tracing revolution and has invested heavily in the ecosystem, and the RTX 4090 suitably smashes all comers. It can run Hitman 3 with ray tracing maxed out while still surpassing 60 frames per second even at 4K, and that’s without DLSS running whatsoever. Turn on DLSS and you can play Cyberpunk 2077 with all its beautifully strenuous ray tracing effects tuned to ultra at a blistering 84fps pace at 4K. Even with ray tracing off, the last-gen RTX 3090 can’t crack 50fps at Ultra settings.

The GeForce RTX 4090 and Ada Lovelace make playing ray traced games practical—but it’s DLSS 3 that offers some truly special sauce.

Finally, the true feather in Nvidia’s cap: DLSS 3. Introduced alongside the RTX 40-series, DLSS 3 builds atop the foundation laid by Nvidia’s existing features to massively accelerate performance with the help of AI-generated frames. It’s a radical and exciting new technology, capable of speeding up even CPU-bound games, though there are caveats worth knowing.

Nvidia’s first two generations of DLSS were a binary feature. You could choose from different quality presets, of course, but at its bones DLSS has been either on or off. That changes with DLSS 3, which actually consists of three different features: DLSS Super Resolution, DLSS Frame Generation, and Nvidia Reflex. Turning on DLSS in a game that supports DLSS 3—Nvidia already announced 35 DLSS 3 games coming down the pipe—reveals options for all three of those complementary features.

DLSS Super Resolution is easy. It’s DLSS. Turning it on reveals the usual Quality, Balanced, Performance, and Ultra Performance options. DLSS (and all upsampling features) works by having the GPU render a frame at a lower resolution, then upsampling it to fit your monitor’s resolution, using software tricks that vary by implementation. Since your GPU is internally rendering, say, a 1080p image rather than a full 4K image, performance soars. DLSS, AMD’s FSR 2, and Intel’s XeSS technologies all deliver similar enough results, with AMD managing to perform the same tricks without the help of dedicated AI hardware.

But AMD has no answer for DLSS Frame Generation, DLSS 3’s secret sauce. DLSS 3 leverages the RTX 4090’s tensor cores and Ada Lovelace’s advanced Optical Flow Accelerator to create fully AI-generated frames, interspersed between every other GPU-rendered frame.

If you’re using DLSS Super Resolution to upscale a game from 1080p to 4K and activate Frame Generation, only one out of every eight pixels shown across a pair of frames was actually rendered on the GPU shaders, which obviously supercharges performance. “The DLSS Frame Generation convolutional autoencoder takes 4 inputs – current and prior game frames, an optical flow field generated by Ada’s Optical Flow Accelerator, and game engine data such as motion vectors and depth,” to work its magic, Nvidia’s DLSS 3 introductory page states. It’s highly recommended reading if you want more nitty-gritty details on how DLSS 3 functions.
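The pixel math behind that one-in-eight figure is quick to verify. Here’s a sketch of the arithmetic (our illustration, not Nvidia’s code):

```python
# DLSS Super Resolution renders internally at 1080p, then upscales to 4K.
rendered_pixels = 1920 * 1080    # 2,073,600 pixels shaded per rendered frame
displayed_pixels = 3840 * 2160   # 8,294,400 pixels shown per displayed frame

upscale_fraction = rendered_pixels / displayed_pixels  # 0.25 -> 1 in 4 pixels

# Frame Generation makes every other displayed frame entirely AI-generated,
# so over a pair of frames the shaders render only half of that:
pair_fraction = upscale_fraction / 2
print(pair_fraction)  # 0.125 -> 1 in 8 pixels rendered on the GPU shaders
```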

These AI frames are fully generated by the tensor cores, using the above inputs to inform the output. Digital Foundry performed an early frame-by-frame analysis on DLSS 3 and found some visual inconsistencies in those alternating AI frames, but with the screaming frame rates DLSS 3 Frame Generation provides, I didn’t see any abnormalities in action while tooling around in Cyberpunk 2077 and Microsoft Flight Simulator. You really need to hunt and pixel peep to see any weirdness. That said, due to the way DLSS Frame Generation works, the effect could be more pronounced in fast-paced games with lots of camera motion.

Dropping AI frames between rendered frames accomplishes two things. First, it sends frame rates screaming, as you see in our benchmarks below. But since those AI frames are generated completely on the tensor cores, with no input needed from the rest of your PC, it can also supercharge performance even in CPU-bound games, which normally stall out because your CPU can’t keep up. Flight Simulator is notoriously CPU-bound, but activating DLSS 3 Frame Generation instantly doubled our frame rates.

Yes, with DLSS 3, you can actually play Flight Sim at a glorious 120Hz on a 4K display. Droooool. And for additional context to the Cyberpunk 2077 slide, the RTX 3090 “only” managed to get about 48 frames per second at 4K with RT Ultra and DLSS 2 Super Resolution active.

But here’s the thing: Weaving in those AI-generated frames also creates CPU backpressure and can actually add latency, reducing responsiveness. The AI frames make your game look buttery smooth, but during those frames, the game isn’t actually responding to your inputs, since the GPU and CPU aren’t doing the work. To combat this, Nvidia mandates that DLSS 3 games also support Nvidia Reflex.

Originally introduced as a way to reduce latency and improve responsiveness in esports games, Reflex zeroes out a game’s render queue, reducing CPU backpressure by having the CPU supply frames to the GPU on a just-in-time basis. Because the CPU isn’t under stress to keep a render queue filled, it can also keep an eye out for mouse clicks until the last possible second. Our testing found it ferociously effective. Pairing Reflex with DLSS Frame Generation keeps responsiveness largely identical and can even lower latency versus native performance in some games. It’s a delightful combo, and I’m pumped to see DLSS 3 drag Reflex kicking and screaming into a much wider arsenal of games.
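Here’s a simplified mental model of why zeroing out the render queue trims latency (our own illustration of the concept, not Nvidia’s actual implementation):

```python
# Toy model: an input sampled now must wait behind every frame already sitting
# in the render queue, plus the frame currently being rendered.
FRAME_TIME_MS = 8.3  # one frame at roughly 120fps

def input_latency_ms(queued_frames: int) -> float:
    return (queued_frames + 1) * FRAME_TIME_MS

print(input_latency_ms(3))  # ~33ms with a traditional 3-frame render queue
print(input_latency_ms(0))  # ~8ms when frames arrive just in time, a la Reflex
```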

That said, while the DLSS 3 experience shines, it doesn’t always feel more responsive in your hands. Running Cyberpunk 2077 at 138fps with every possible bell and whistle active thanks to DLSS 3 looks absolutely stunning, full stop. But one of the perks of faster frame rates is increased responsiveness, and while Reflex greatly helps with that, Cyberpunk 2077 running at 138fps doesn’t feel substantially more responsive than running it at the 77fps or so it hits in native 4K. Having twice as many frames looks fantastic, but those AI frames aren’t responding to your controls in the actual moment, though it’s close.

This is not a negative. This is not a drawback. It can’t be after running Flight Sim at 120fps, and Cyberpunk 2077 with ultra ray tracing at 138fps (the RTX 3090 tops out at 48fps). But it’s worth highlighting. DLSS 3 works stunningly in the slower-paced games we’ve tried, but if you’re looking for maximum responsiveness in sweaty competitive titles, you’d probably be better off disabling Frame Generation while activating Reflex and DLSS Super Resolution.

We need to spend much more time testing DLSS 3 Frame Generation in action before we make more concrete recommendations. If you’re inclined to do so yourself, a new iteration of Nvidia’s fantastic FrameView benchmarking tool includes new easy-to-read averages for 0.1 percent lows and overall PC latency, in addition to the usual stats. An impressive 35 games have already pledged to support DLSS 3, and Nvidia sent over notes about launch timing for specific games.

We test power draw by looping the F1 22 benchmark at 4K for about 20 minutes after we’ve benchmarked everything else (to warm up the GPU) and noting the highest reading on our Watts Up Pro meter, which measures the power consumption of our entire test system. The initial part of the race, where all competing cars are onscreen simultaneously, tends to be the most demanding portion. 

This isn’t a worst-case test. This is a GPU-bound game running at a GPU-bound resolution to gauge performance when the graphics card is sweating hard. If you’re playing a game that also hammers the CPU, you could see higher overall system power draws. Consider yourself warned.

The RTX 3090 used a lot of power. The GeForce RTX 4090, with its swanky new 12VHPWR power cable, uses even more. No surprises there. You don’t buy a graphics card like this if you’re worried about energy use; you buy it for peak performance.

We test thermals by leaving GPU-Z open during the F1 22 power draw test, noting the highest maximum temperature at the end.

Nvidia’s ultra-thick, metal-clad Founders Edition design shines yet again. Despite having to tame an unprecedented amount of raw power, the RTX 4090 FE stays whisper quiet and runs ultra-cool.

Is the 4090 for you? Probably not. Most people shouldn’t spend $1,600 on a graphics card, just like they shouldn’t spend hundreds of thousands of dollars on a Lambo. Lambos exist for a reason, however. If you want peak performance no matter the price, you’ll be spectacularly pleased with the GeForce RTX 4090 Founders Edition. This is the first graphics card capable of maxing out a 120Hz 4K monitor in many modern games—a monumental achievement.

The GeForce RTX 4090 embarrasses all previous GPU contenders in all games, full stop. The uplift isn’t quite as convincing in esports and DirectX 11 titles, but the victories are there, and in games running the more modern DX12 and Vulkan APIs, the RTX 4090 is anywhere from 55 to 83 percent faster than the RTX 3090. That’s on par with the RTX 3080’s uplift over the RTX 2080.

This GPU is so fast, we witnessed some games suffering from CPU bottlenecks even at graphics-heavy 4K resolution. It screams. Dropping down to 1440p delivers still-sterling but less impressive generational gains, as the 4090’s might runs into more CPU and game engine bottlenecks at the lower resolution. You really want to buy this for use with a 4K or ultrawide monitor with a blazing-fast refresh rate. If you have a 60Hz 4K monitor, or a 1440p monitor, prior-gen GPUs still deliver plenty of oomph for a lot less money.

Ray tracing is where the GeForce RTX 4090 truly shines. Nvidia’s Ada Lovelace architecture was optimized for these futuristic lighting effects, and the RTX 4090 can play 4K games with every graphics setting—RT and otherwise—cranked to 11 while shattering the hallowed 60 frames per second mark with DLSS enabled. Again, that’s a monumental achievement.

And DLSS 3’s wondrous AI Frame Generation feature can send frame rates soaring even higher in games that support it, although it doesn’t always make games feel proportionally more responsive. Cyberpunk 2077 runs at 48 frames per second with DLSS active on last-gen’s RTX 3090; on the RTX 4090 with DLSS 3 Frame Gen active, it hits a blistering 138fps. That’s nearly three times as fast and an unrivaled visual experience. Other GPUs hover around 60fps in Microsoft Flight Simulator thanks to its CPU bottlenecking, but DLSS Frame Gen doubles that. We’ll need to see if Frame Gen’s fidelity and responsiveness hold up over a wider spread of games, but in these early days, DLSS 3 works like black magic. It’s wonderful, and it’s coming soon to 35 games. Nvidia is doing impressive stuff with its software features that AMD has yet to match.

We’re aiming to have a separate examination of the RTX 4090’s content creation chops in the coming days, but creators will love this thing, just like they did the RTX 3090. The 24GB of ultra-fast GDDR6X memory paired with dual AV1 encoders will make quick work of rendering and encoding tasks.

Power draw went up a bit this generation, and you’ll need to use a new 12VHPWR power cable, but it’s a worthwhile trade for this performance. The price went up slightly too, with the RTX 4090 costing $100 more than its predecessor. That stings, but seems reasonable enough for a flagship GPU after years of inflation and pandemic-related supply woes. Nvidia’s awesome Founders Edition cooler returns in slightly revamped form. It’s huge yet cool, quiet, and incredibly attractive. It’s so appealing, in fact, that it makes custom designs harder to recommend.

Bottom line? The GeForce RTX 4090 is an absolute monster, delivering dominating performance built for the future of games and content creation. It’s not perfect, but the few mild drawbacks it has are less “flaws” and more “just what you have to deal with for this much speed.” If you’re willing to spend $1,600 on a graphics card, the GeForce RTX 4090 provides the luxurious gaming experience you’re looking for—and if DLSS 3 gains traction, it could be so much faster in time.

It’s a goliath start to Nvidia’s Ada Lovelace generation. AMD’s RDNA 3 Radeon GPUs will have their work cut out for them when they’re revealed on November 3. Here’s where you can buy the RTX 4090.

Editor’s note: This article originally published on October 11, but was updated October 12 to include links to our roundup of RTX 4090 models when the GPU hit the streets.

Brad Chacos spends his days digging through desktop PCs and tweeting too much. He specializes in graphics cards and gaming, but covers everything from security to Windows tips and all manner of PC hardware.